# Multi-Task Unified Framework
Lotus Depth D V1 0
Apache-2.0
Lotus is a diffusion-based vision foundation model focused on high-quality dense prediction tasks.
3D Vision
jingheya
135
4
Blip Image Captioning Base Football Finetuned
BSD-3-Clause
A vision-language model pre-trained on COCO and fine-tuned on a football dataset to generate image captions.
Image-to-Text
Transformers

ybelkada
71
2
Featured Recommended AI Models